{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "04_GoogleNews_Cleaner_Splitter_Classification_Aggregator.ipynb",
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "23R6n3CjjMzV"
      },
      "source": [
        "# Obsei Tutorial 04\n",
        "## This example shows the following Obsei workflow\n",
        " 1. Observe: Search for and fetch news articles via Google News\n",
        " 2. Cleaner: Clean the article text\n",
        " 3. Analyze: Classify the article text by splitting it into small chunks, then compute the final inference with the configured aggregation function"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XQiAxuEqlMpB"
      },
      "source": [
        "## Install Obsei and perform these steps\n",
        "- Select a GPU runtime type for faster computation\n",
        "- Restart the runtime after installation"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "CSwBGl1xis5q",
        "outputId": "cb0676da-e778-46f7-f761-ad59687e125d"
      },
      "source": [
        "!pip install obsei[all]"
      ],
      "execution_count": 1,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Collecting git+https://github.com/lalitpagaria/obsei.git\n",
            "  Cloning https://github.com/lalitpagaria/obsei.git to /tmp/pip-req-build-4p5rgj_4\n",
            "  Running command git clone -q https://github.com/lalitpagaria/obsei.git /tmp/pip-req-build-4p5rgj_4\n",
            "Requirement already satisfied: app-store-reviews-reader==1.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.2)\n",
            "Requirement already satisfied: atlassian-python-api==3.10.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.10.0)\n",
            "Requirement already satisfied: beautifulsoup4==4.9.3 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.9.3)\n",
            "Requirement already satisfied: blis==0.7.4 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.7.4)\n",
            "Requirement already satisfied: cachetools==4.2.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.2.2)\n",
            "Requirement already satisfied: catalogue==2.0.4 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.0.4)\n",
            "Requirement already satisfied: certifi==2021.5.30 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2021.5.30)\n",
            "Requirement already satisfied: chardet==4.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.0.0)\n",
            "Requirement already satisfied: click==7.1.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (7.1.2)\n",
            "Requirement already satisfied: courlan==0.4.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.4.0)\n",
            "Requirement already satisfied: cssselect==1.1.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.1.0)\n",
            "Requirement already satisfied: cymem==2.0.5 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.0.5)\n",
            "Requirement already satisfied: dateparser==1.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.0)\n",
            "Requirement already satisfied: deprecated==1.2.12 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.2.12)\n",
            "Requirement already satisfied: elasticsearch==7.13.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (7.13.1)\n",
            "Requirement already satisfied: feedparser==6.0.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (6.0.2)\n",
            "Requirement already satisfied: filelock==3.0.12 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.12)\n",
            "Requirement already satisfied: gnews==0.1.3 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.1.3)\n",
            "Requirement already satisfied: google-api-core==1.30.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.30.0)\n",
            "Requirement already satisfied: google-api-python-client==2.8.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.8.0)\n",
            "Requirement already satisfied: google-auth==1.30.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.30.2)\n",
            "Requirement already satisfied: google-auth-httplib2==0.1.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.1.0)\n",
            "Requirement already satisfied: google-play-scraper==1.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.0)\n",
            "Requirement already satisfied: googleapis-common-protos==1.53.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.53.0)\n",
            "Requirement already satisfied: greenlet==1.1.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.1.0)\n",
            "Requirement already satisfied: htmldate==0.8.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.8.1)\n",
            "Requirement already satisfied: httplib2==0.19.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.19.1)\n",
            "Requirement already satisfied: huggingface-hub==0.0.8 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.0.8)\n",
            "Requirement already satisfied: idna==2.10 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.10)\n",
            "Requirement already satisfied: importlib-metadata==4.5.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.5.0)\n",
            "Requirement already satisfied: jinja2==3.0.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.1)\n",
            "Requirement already satisfied: joblib==1.0.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.1)\n",
            "Requirement already satisfied: justext==2.2.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.2.0)\n",
            "Requirement already satisfied: lxml==4.6.3 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.6.3)\n",
            "Requirement already satisfied: markupsafe==2.0.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.0.1)\n",
            "Requirement already satisfied: mmh3==3.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.0)\n",
            "Requirement already satisfied: murmurhash==1.0.5 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.5)\n",
            "Requirement already satisfied: nltk==3.6.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.6.2)\n",
            "Requirement already satisfied: numpy==1.20.3 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.20.3)\n",
            "Requirement already satisfied: oauthlib==3.1.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.1.1)\n",
            "Requirement already satisfied: packaging==20.9 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (20.9)\n",
            "Requirement already satisfied: pandas==1.2.4 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.2.4)\n",
            "Requirement already satisfied: pathy==0.5.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.5.2)\n",
            "Requirement already satisfied: praw==7.2.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (7.2.0)\n",
            "Requirement already satisfied: prawcore==2.1.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.1.0)\n",
            "Requirement already satisfied: preshed==3.0.5 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.5)\n",
            "Requirement already satisfied: presidio-analyzer==2.2.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.2.1)\n",
            "Requirement already satisfied: presidio-anonymizer==2.2.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.2.1)\n",
            "Requirement already satisfied: protobuf==3.17.3 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.17.3)\n",
            "Requirement already satisfied: pyasn1==0.4.8 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.4.8)\n",
            "Requirement already satisfied: pyasn1-modules==0.2.8 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.2.8)\n",
            "Requirement already satisfied: pycryptodome==3.10.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.10.1)\n",
            "Requirement already satisfied: pydantic==1.7.4 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.7.4)\n",
            "Requirement already satisfied: pyparsing==2.4.7 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.4.7)\n",
            "Requirement already satisfied: python-dateutil==2.8.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.8.1)\n",
            "Requirement already satisfied: python-facebook-api==0.9.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.9.2)\n",
            "Requirement already satisfied: pytz==2021.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2021.1)\n",
            "Requirement already satisfied: pyyaml==5.4.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (5.4.1)\n",
            "Requirement already satisfied: readability-lxml==0.8.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.8.1)\n",
            "Requirement already satisfied: reddit-rss-reader==1.3.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.3.2)\n",
            "Requirement already satisfied: regex==2020.11.13 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2020.11.13)\n",
            "Requirement already satisfied: requests==2.25.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.25.1)\n",
            "Requirement already satisfied: requests-file==1.5.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.5.1)\n",
            "Requirement already satisfied: requests-oauthlib==1.3.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.3.0)\n",
            "Requirement already satisfied: rsa==4.7.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.7.2)\n",
            "Requirement already satisfied: sacremoses==0.0.45 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.0.45)\n",
            "Requirement already satisfied: searchtweets-v2==1.0.7 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.7)\n",
            "Requirement already satisfied: sentencepiece==0.1.95 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.1.95)\n",
            "Requirement already satisfied: sgmllib3k==1.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.0)\n",
            "Requirement already satisfied: six==1.16.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.16.0)\n",
            "Requirement already satisfied: slack-sdk==3.6.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.6.0)\n",
            "Requirement already satisfied: smart-open==3.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.0)\n",
            "Requirement already satisfied: soupsieve==2.2.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.2.1)\n",
            "Requirement already satisfied: spacy==3.0.5 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.5)\n",
            "Requirement already satisfied: spacy-legacy==3.0.5 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.5)\n",
            "Requirement already satisfied: sqlalchemy==1.4.17 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.4.17)\n",
            "Requirement already satisfied: srsly==2.4.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.4.1)\n",
            "Requirement already satisfied: thinc==8.0.4 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (8.0.4)\n",
            "Requirement already satisfied: tld==0.12.6 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.12.6)\n",
            "Requirement already satisfied: tldextract==3.1.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.1.0)\n",
            "Requirement already satisfied: tokenizers==0.10.3 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.10.3)\n",
            "Requirement already satisfied: tqdm==4.61.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.61.0)\n",
            "Requirement already satisfied: trafilatura==0.8.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.8.2)\n",
            "Requirement already satisfied: transformers==4.6.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (4.6.1)\n",
            "Requirement already satisfied: tweet-preprocessor==0.6.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.6.0)\n",
            "Requirement already satisfied: typer==0.3.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.3.2)\n",
            "Requirement already satisfied: typing-extensions==3.10.0.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.10.0.0)\n",
            "Requirement already satisfied: tzlocal==2.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.1)\n",
            "Requirement already satisfied: update-checker==0.18.0 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.18.0)\n",
            "Requirement already satisfied: uritemplate==3.0.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.0.1)\n",
            "Requirement already satisfied: urllib3==1.26.5 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.26.5)\n",
            "Requirement already satisfied: vadersentiment==3.3.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.3.2)\n",
            "Requirement already satisfied: wasabi==0.8.2 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (0.8.2)\n",
            "Requirement already satisfied: websocket-client==1.0.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.0.1)\n",
            "Requirement already satisfied: wrapt==1.12.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.12.1)\n",
            "Requirement already satisfied: zenpy==2.0.24 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (2.0.24)\n",
            "Requirement already satisfied: zipp==3.4.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (3.4.1)\n",
            "Requirement already satisfied: torch==1.8.1 in /usr/local/lib/python3.7/dist-packages (from obsei==0.0.9) (1.8.1)\n",
            "Requirement already satisfied: setuptools>=40.3.0 in /usr/local/lib/python3.7/dist-packages (from google-api-core==1.30.0->obsei==0.0.9) (57.2.0)\n",
            "Requirement already satisfied: responses>=0.11 in /usr/local/lib/python3.7/dist-packages (from python-facebook-api==0.9.2->obsei==0.0.9) (0.13.3)\n",
            "Requirement already satisfied: attrs<21.0.0,>=20.1.0 in /usr/local/lib/python3.7/dist-packages (from python-facebook-api==0.9.2->obsei==0.0.9) (20.3.0)\n",
            "Requirement already satisfied: cattrs<2.0,>=1.1 in /usr/local/lib/python3.7/dist-packages (from python-facebook-api==0.9.2->obsei==0.0.9) (1.7.1)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mEIG7Zs-lQVB"
      },
      "source": [
        "## Configure Google News Observer"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ASPOB5Alla7q"
      },
      "source": [
        "from obsei.source.google_news_source import GoogleNewsConfig, GoogleNewsSource\n",
        "\n",
        "source_config = GoogleNewsConfig(\n",
        "    query=\"bitcoin\",\n",
        "    max_results=10,\n",
        "    fetch_article=True,\n",
        "    lookup_period=\"1d\",\n",
        ")\n",
        "\n",
        "source = GoogleNewsSource()"
      ],
      "execution_count": 10,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CJJ06rpalsYB"
      },
      "source": [
        "## Configure TextCleaner as Pre-Processor to clean article text\n",
        "These cleaning functions run serially, in the order listed"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "TksY24crlsy6",
        "outputId": "315f6f9d-ed52-4975-9f14-70f8bc218dee"
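      },
      "source": [
        "from obsei.preprocessor.text_cleaner import TextCleaner, TextCleanerConfig\n",
        "from obsei.preprocessor.text_cleaning_function import (\n",
        "    DecodeUnicode,\n",
        "    RemovePunctuation,\n",
        "    RemoveSpecialChars,\n",
        "    RemoveStopWords,\n",
        "    RemoveWhiteSpaceAndEmptyToken,\n",
        "    ToLowerCase,\n",
        ")\n",
        "\n",
        "# Cleaning functions are applied in the order listed\n",
        "text_cleaner_config = TextCleanerConfig(\n",
        "    cleaning_functions=[\n",
        "        ToLowerCase(),\n",
        "        RemoveWhiteSpaceAndEmptyToken(),\n",
        "        RemovePunctuation(),\n",
        "        RemoveSpecialChars(),\n",
        "        DecodeUnicode(),\n",
        "        RemoveStopWords(),\n",
        "        # Second pass drops tokens left empty by the earlier steps\n",
        "        RemoveWhiteSpaceAndEmptyToken(),\n",
        "    ]\n",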
        ")\n",
        "\n",
        "text_cleaner = TextCleaner()"
      ],
      "execution_count": 14,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "[nltk_data] Downloading package stopwords to /root/nltk_data...\n",
            "[nltk_data]   Package stopwords is already up-to-date!\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "z45pE_BVl_gu"
      },
      "source": [
        "## Configure Classification Analyzer\n",
        "\n",
        "- List of categories in `labels`\n",
        "- `TextSplitterConfig` with suitable `max_split_length` and `split_stride` values\n",
        "- `InferenceAggregatorConfig` with the required `aggregate_function`; two are currently supported (average and most frequent class)\n",
        "- `ClassificationMaxCategories` needs a `score_threshold`, the minimum probability a class must reach to be taken into consideration\n",
        "\n",
        "**Note**: To try a different model, select one from https://huggingface.co/models?pipeline_tag=zero-shot-classification"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "RGzOD5yrl_CB"
      },
      "source": [
        "from obsei.analyzer.classification_analyzer import ClassificationAnalyzerConfig, ZeroShotClassificationAnalyzer\n",
        "from obsei.postprocessor.inference_aggregator import InferenceAggregatorConfig\n",
        "from obsei.postprocessor.inference_aggregator_function import ClassificationMaxCategories\n",
        "from obsei.preprocessor.text_splitter import TextSplitterConfig\n",
        "\n",
        "analyzer_config = ClassificationAnalyzerConfig(\n",
        "    labels=[\"buy\", \"sell\", \"going up\", \"going down\"],\n",
        "    use_splitter_and_aggregator=True,\n",
        "    splitter_config=TextSplitterConfig(\n",
        "        max_split_length=300,\n",
        "        split_stride=3\n",
        "    ),\n",
        "    aggregator_config=InferenceAggregatorConfig(\n",
        "        aggregate_function=ClassificationMaxCategories(\n",
        "            score_threshold=0.3\n",
        "        )\n",
        "    )\n",
        ")\n",
        "\n",
        "text_analyzer = ZeroShotClassificationAnalyzer(\n",
        "    model_name_or_path=\"typeform/mobilebert-uncased-mnli\",\n",
        "    device=\"auto\"\n",
        ")"
      ],
      "execution_count": 11,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ScovHM79oMLo"
      },
      "source": [
        "## Search and fetch news articles"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "WniWcEzeoOKx",
        "outputId": "a7b1c1e5-fc44-4b67-8bbe-fa717a5f48bf"
      },
      "source": [
        "source_response_list = source.lookup(source_config)"
      ],
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "07/29/2021 19:08:37 - INFO - urllib3.poolmanager -   Redirecting https://www.bloomberg.com/news/articles/2021-07-29/brokers-sought-for-78-million-bitcoin-stash-from-finland-bust -> https://www.bloomberg.com/tosv2.html?vid=&uuid=60bfaca0-f0a0-11eb-9e4f-53da6d852d97&url=L25ld3MvYXJ0aWNsZXMvMjAyMS0wNy0yOS9icm9rZXJzLXNvdWdodC1mb3ItNzgtbWlsbGlvbi1iaXRjb2luLXN0YXNoLWZyb20tZmlubGFuZC1idXN0\n",
            "07/29/2021 19:08:37 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:37 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:37 - INFO - urllib3.poolmanager -   Redirecting https://www.bloomberg.com/news/articles/2021-07-29/new-ira-product-allows-for-tax-free-bitcoin-mining -> https://www.bloomberg.com/tosv2.html?vid=&uuid=60f22e50-f0a0-11eb-90b4-6d36db3c27b3&url=L25ld3MvYXJ0aWNsZXMvMjAyMS0wNy0yOS9uZXctaXJhLXByb2R1Y3QtYWxsb3dzLWZvci10YXgtZnJlZS1iaXRjb2luLW1pbmluZw==\n",
            "07/29/2021 19:08:37 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:37 - ERROR - trafilatura.core -   not enough text None\n",
            "07/29/2021 19:08:37 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:38 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:38 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:38 - INFO - readability.readability -   ruthless removal did not work. \n",
            "07/29/2021 19:08:38 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:38 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:39 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:39 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:39 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:39 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:40 - INFO - readability.readability -   ruthless removal did not work. \n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   using generic algorithm: None\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   not enough comments None\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   text and comments not long enough: 0 0\n",
            "07/29/2021 19:08:40 - INFO - trafilatura.core -   discarding data for url: None\n",
            "07/29/2021 19:08:41 - INFO - trafilatura.core -   using custom extraction: None\n",
            "07/29/2021 19:08:41 - INFO - trafilatura.core -   not enough comments None\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KEvZ32HZoY57"
      },
      "source": [
        "## Preprocess the text to clean it"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "OpFEbn1joYbK"
      },
      "source": [
        "cleaner_response_list = text_cleaner.preprocess_input(\n",
        "    input_list=source_response_list,\n",
        "    config=text_cleaner_config\n",
        ")"
      ],
      "execution_count": 15,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wsh3e3uCogxK"
      },
      "source": [
        "## Analyze articles to perform classification\n",
        "**Note**: This is a compute-heavy step"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "YSe3F6EyohX9",
        "outputId": "a33d01b6-d632-4475-9ae8-abd46d7b45f0"
      },
      "source": [
        "analyzer_response_list = text_analyzer.analyze_input(\n",
        "    source_response_list=cleaner_response_list,\n",
        "    analyzer_config=analyzer_config\n",
        ")"
      ],
      "execution_count": 16,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YS9Oc4DrovN_"
      },
      "source": [
        "## Print Result"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Kl7OIaDxov3L",
        "outputId": "38520cd8-c8a9-4212-ade8-99db905bc6b7"
      },
      "source": [
        "for analyzer_response in analyzer_response_list:\n",
        "    print(vars(analyzer_response))"
      ],
      "execution_count": 17,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "{'segmented_data': {'aggregator_data': {'category_count': {'positive': 2, 'going up': 2, 'sell': 1}, 'max_scores': {'positive': 0.806824266910553, 'going up': 0.5611677169799805, 'sell': 0.5141412019729614}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Bitcoin (BTC USD) Cryptocurrency Price News: Finland Seeks Broker to Sell Stash - Bloomberg', 'description': 'Bitcoin (BTC USD) Cryptocurrency Price News: Finland Seeks Broker to Sell Stash  Bloomberg', 'published date': 'Thu, 29 Jul 2021 13:11:21 GMT', 'url': 'https://www.bloomberg.com/news/articles/2021-07-29/brokers-sought-for-78-million-bitcoin-stash-from-finland-bust', 'publisher': {'href': 'https://www.bloomberg.com', 'title': 'Bloomberg'}, 'extracted_data': {'title': 'Bloomberg', 'author': None, 'hostname': None, 'date': None, 'categories': '', 'tags': '', 'fingerprint': 'BecpvREYR0Bqj6DjTeoRthAFuAs=', 'id': '6e25ac22', 'source': None, 'source-hostname': 'Are you a robot?', 'excerpt': None}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin btc usd cryptocurrency price news finland seeks broker sell stash bloomberg continue please click box let us know robot please make sure browser supports javascript cookies blocking loading information review terms service cookie policy inquiries related message please contact support team team provide reference id'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'positive': 1, 'going up': 1}, 'max_scores': {'positive': 0.7125387787818909, 'going up': 0.7029632329940796}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Bitcoin (BTC USD) Cryptocurrency Mining Gets Tax Savings With New IRA - Bloomberg', 'description': 'Bitcoin (BTC USD) Cryptocurrency Mining Gets Tax Savings With New IRA  Bloomberg', 'published date': 'Thu, 29 Jul 2021 14:00:00 GMT', 'url': 'https://www.bloomberg.com/news/articles/2021-07-29/new-ira-product-allows-for-tax-free-bitcoin-mining', 'publisher': {'href': 'https://www.bloomberg.com', 'title': 'Bloomberg'}, 'extracted_data': {'title': 'Bloomberg', 'author': None, 'hostname': None, 'date': None, 'categories': '', 'tags': '', 'fingerprint': '1f8W9+brqRuBfWvFtSJ5NU8s49g=', 'id': '10fba49b', 'source': None, 'source-hostname': 'Are you a robot?', 'excerpt': None}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin btc usd cryptocurrency mining gets tax savings new ira bloomberg continue please click box let us know robot terms service cookie policy contact support team provide reference id'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going up': 7, 'positive': 8, 'going down': 2, 'negative': 1}, 'max_scores': {'going up': 0.9746450185775757, 'positive': 0.8058925867080688, 'going down': 0.8555613160133362, 'negative': 0.594179630279541}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Bitcoin price hovers around $40000 - Fox Business', 'description': 'Bitcoin price hovers around $40000  Fox Business', 'published date': 'Thu, 29 Jul 2021 08:40:26 GMT', 'url': 'https://www.foxbusiness.com/markets/bitcoin-price-7-29', 'publisher': {'href': 'https://www.foxbusiness.com', 'title': 'Fox Business'}, 'extracted_data': {'title': 'Bitcoin price hovers around $40,000', 'author': 'Ken Martin', 'hostname': 'foxbusiness.com', 'date': '2021-07-29', 'categories': '', 'tags': '', 'fingerprint': '0bin0zBplkMxTzzjw7s2HBTsoi4=', 'id': '1057f820', 'source': 'https://www.foxbusiness.com/markets/bitcoin-price-7-29', 'source-hostname': 'Fox Business', 'excerpt': None}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin price hovers around 40000 fox business bitcoin price hovers around 40000 infrastructure deal agreed upon lawmakers president biden includes crackdown cryptocurrencies one way raise revenue bitcoin slightly higher thursday morning trading 40000 level bitcoin slightly higher thursday morning morning trading 40000 level price around 39800 per coin rivals ethereum dogecoin trading around 2290 20 cents per coin respectively according coindesk ethereum dogecoin trading around 2290 20 cents per coin respectively according coindesk crypto crackdown help finance bipartisan infrastructure bill bill crypto crackdown help finance bipartisan infrastructure bill infrastructure deal agreed upon lawmakers president biden wednesday includes crackdown cryptocurrencies one way raise revenue support spending infrastructure deal agreed upon lawmakers president biden wednesday includes crackdown crackdown cryptocurrencies one 
way raise revenue support spending get fox business go clicking get fox business go clicking order finance 500 billion new spending infrastructure initiatives group said intends strengthen tax enforcement comes cryptocurrencies previously reported fox business irs irs commissioner charles rettig requested broader authority congress june collect information cryptocurrency transactions previously reported fox business irs commissioner charles rettig requested broader authority congress june collect information cryptocurrency transactions rettig said said transactions design often radar screens noting recent market cap crypto world exceeded 2 trillion 8600 exchanges worldwide robinhood prices ipo 38 per share raising 2b robinhood prices ipo 38 per share raising 2b news robinhood market inc begin trading public company thursday following pricing pricing initial public offering robinhood market inc begin trading public company thursday following pricing initial public offering company says priced 55 million shares 38 share comes low end 38 42 offering range offering raised 2 billion puts robinhood valuation 32 billion shares begin trading trading nasdaq ticker symbol hood click read fox business click read fox business company reportedly working new feature help protect users crypto price volatility hiring manager formerly new feature help protect users crypto price volatility hiring manager formerly google improve overall product product design according coindesk fox business brittany de lea contributed report'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going down': 1, 'positive': 1, 'going up': 1}, 'max_scores': {'going down': 0.5592474937438965, 'positive': 0.66602623462677, 'going up': 0.6636335253715515}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Institutional investors are bullish on bitcoin again, based on this key data point - CNBC', 'description': 'Institutional investors are bullish on bitcoin again, based on this key data point  CNBC', 'published date': 'Wed, 28 Jul 2021 21:10:13 GMT', 'url': 'https://www.cnbc.com/2021/07/28/institutional-investors-are-bullish-on-bitcoin-again-based-on-this-key-data-point.html', 'publisher': {'href': 'https://www.cnbc.com', 'title': 'CNBC'}, 'extracted_data': {'title': 'Institutional investors are bullish on bitcoin again, based on this key data point', 'author': 'Tanaya Macheel', 'hostname': 'cnbc.com', 'date': '2021-07-28', 'categories': 'Markets', 'tags': 'Coinbase Global Inc,Cryptocurrency,Bitcoin,Markets,business news;Coinbase Global Inc;Cryptocurrency;Bitcoin;Markets', 'fingerprint': '+Q2CPT89AAI5Vn5g8fu+SDmyvtE=', 'id': '7a9162df', 'source': 'https://www.cnbc.com/2021/07/28/institutional-investors-are-bullish-on-bitcoin-again-based-on-this-key-data-point.html', 'source-hostname': 'CNBC', 'excerpt': 'About $2.5 billion in bitcoin moved off crypto exchanges Wednesday morning.'}}, 'source_name': 'GoogleNews', 'processed_text': 'institutional investors bullish bitcoin based key data point cnbc 25 billion bitcoin moved crypto exchanges wednesday morning signal institutional investors getting sidelines bearish weeks cryptocurrency balance bitcoin exchanges fell 63289 btc transferred platforms according blockchain data data provider glassnode tracks flows exchanges including coinbase kraken binance coinbase kraken binance cryptocurrency'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going up': 4, 'positive': 1, 'negative': 3, 'going down': 1}, 'max_scores': {'going up': 0.9795074462890625, 'positive': 0.8666032552719116, 'negative': 0.6999003887176514, 'going down': 0.6623135209083557}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Bitcoin Overbought at $40K Resistance; Support at $34K-$36K - CoinDesk - CoinDesk', 'description': 'Bitcoin Overbought at $40K Resistance; Support at $34K-$36K - CoinDesk  CoinDesk', 'published date': 'Thu, 29 Jul 2021 11:26:51 GMT', 'url': 'https://www.coindesk.com/bitcoin-overbought-at-40k-resistance-support-at-34k-36k', 'publisher': {'href': 'https://www.coindesk.com', 'title': 'CoinDesk'}, 'extracted_data': {'title': 'Bitcoin Overbought at $40K Resistance; Support at $34K-$36K - CoinDesk', 'author': 'Damanick Dantes; Damanick Dantes', 'hostname': 'coindesk.com', 'date': '2021-07-29', 'categories': 'Markets', 'tags': '', 'fingerprint': '+/Yd9ybp8pjiIf4k9x8lJMNK1D8=', 'id': '33c1eec8', 'source': 'https://www.coindesk.com/bitcoin-overbought-at-40k-resistance-support-at-34k-36k', 'source-hostname': 'CoinDesk', 'excerpt': 'Bitcoin (BTC) is re-testing the $40K resistance level and appears overbought. 
Buyers could take profits as short-term momentum wanes.'}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin overbought 40k resistance support 34k 36k coindesk coindesk bitcoin btc completed recovery monday 10 pullback retesting 40000 resistance level cryptocurrency appears overbought could trigger profit taking near 25 rally past week btc completed recovery monday 10 pullback retesting 40000 40000 resistance level cryptocurrency appears overbought could trigger profit taking near 25 rally past week lower support seen around 34000 36000 middle twomonth range relative strength index rsi fourhour chart declining extreme overbought reading monday lower high rsi indicates bearish divergence divergence could stall bitcoin shortterm uptrend initial support seen 50period moving average fourhour chart currently 36000 lower support around 32000 34000 could stabilize pullback intermediateterm trend improving significant loss downside momentum past weeks buyers could remain active lower lower support levels although breakout 40000 45000 needed resume longterm uptrend strict set editorial policies coindesk independent operating subsidiary digital currency group invests cryptocurrencies blockchain startups'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going up': 15, 'positive': 13, 'going down': 7, 'negative': 6}, 'max_scores': {'going up': 0.9941290616989136, 'positive': 0.7976483702659607, 'going down': 0.9561280608177185, 'negative': 0.9916792511940002}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Market Wrap: Bitcoin Expected to Pause Before Next Rally - CoinDesk - CoinDesk', 'description': 'Market Wrap: Bitcoin Expected to Pause Before Next Rally - CoinDesk  CoinDesk', 'published date': 'Wed, 28 Jul 2021 20:39:06 GMT', 'url': 'https://www.coindesk.com/market-wrap-bitcoin-expected-to-pause-before-next-rally', 'publisher': {'href': 'https://www.coindesk.com', 'title': 'CoinDesk'}, 'extracted_data': {'title': 'Market Wrap: Bitcoin Expected to Pause Before Next Rally - CoinDesk', 'author': 'Damanick Dantes; Frances Yue; Damanick Dantes; Frances Yue', 'hostname': 'coindesk.com', 'date': '2021-07-28', 'categories': 'Markets', 'tags': '', 'fingerprint': 'SPuuujtthDxGiJ7Dt6ayWt7Rg1g=', 'id': '9fc9b88e', 'source': 'https://www.coindesk.com/market-wrap-bitcoin-expected-to-pause-before-next-rally', 'source-hostname': 'CoinDesk', 'excerpt': 'Analysts expect bitcoin (BTC) to pause around $40K resistance before the next leg up. 
Sentiment is approving, but appears overbought.'}}, 'source_name': 'GoogleNews', 'processed_text': 'market wrap bitcoin expected pause next rally coindesk coindesk bitcoin buyers profittaking mode cryptocurrency tests 40000 resistance level sentiment significantly improved past week although analysts think time pause another leg higher btc easily broke 35k think probably harder time going 40k time time justin chuh senior trader wave financial wrote email coindesk wave financial wrote email coindesk miners sellers coming cash buyers unable push higher absorbing hit chuh wrote latest prices cryptocurrencies traditional markets p 500 44039 0056 gold 18081 05 10year treasury yield closed 1233 1233 compared 1238 moving average watch sentiment easily shift bullish bearish bitcoin remains consolidation phase strong overhead resistance btc already rejected 200day moving average like early june try breather hopefully crawling lower 35k chuh wrote bitcoin cross 200day signal confidence market market demonstrate many players bulls regained control market alexandra clark trader ukbased digitalasset broker globalblock wrote email coindesk globalblock wrote email coindesk trading activity sharply higher compared june shortdated call options actively traded wednesday morning bitcoin bitcoin approached 40000 according data skew gbtc discount narrows grayscale bitcoin trust gbtc shares narrowed discount relative underlying cryptocurrency held fund possibly sign buyers using vehicle bet recent recovery rally digitalasset markets narrowed discount relative underlying cryptocurrency cryptocurrency held fund possibly sign buyers using vehicle bet recent recovery rally digitalasset markets gbtc shares traded discount 66 net asset value nav tuesday smallest margin since june 22 based data provided crypto derivatives research firm skew discount widened 15 midjune skew discount discount widened 15 midjune investors may snapped gbtc shares hopes discount evaporate bull revival bitcoin 
scenario buyers would reap price gains bitcoin pocketing extra profit narrowing discount grayscale investments manages trust unit digital currency group also owns coindesk ether trading trading volumes surge ether market grew three times faster bitcoin market first six months year large investors diversified native token ethereum blockchain according crypto exchange coinbase halfyearly review published monday grew three times faster bitcoin market first six months year large large investors diversified native token ethereum blockchain according crypto exchange coinbase halfyearly review published monday crypto ceos bullish crypto investors endured one toughest quarters record despite recent rebound fears overregulation clampdown mining china environmental concerns concerns contributed negative sentiment sector coindesk 20 assets constitute 99 crypto market verifiable volume finished second quarter negative returns coindesk bitcoin price index xbx fell 40 thirdworst quarter ever conversely coindesk ether price index etx ended quarter 187 bitcoin recovered recovered losses level optimism far start second quarter crypto ceos however still expect sixfigure bitcoin price saying mediumterm outlook crypto market positive even sentiment coindesk canny reports reports stablecoins spotlight stablecoins existed roughly seven years talk never heated recent recent weeks within crypto community also among regulators traditional market investors much going world stablecoins recently overwhelming three big things happening three big things happening tether cloud traded cryptocurrency market usdt become backbone entire cryptocurrency ecosystem half bitcoin bitcoin trades made however tether company behind digital token plagued regulatory issues regulatory heat stablecoins total market capitalization 116 billion monday almost fourfold increase since start year according coinmarketcap growth increased attention us regulators circle going public public stablecoin issuers disclose info 
circle issuer usdc second largest stablecoin also spotlight circle plans go public merger concord acquisition corp publicly traded special purpose acquisition corporation spac deal would value crypto financial services firm 45 billion another stablecoin issuer issuer paxos also released first time breakdown reserves stablecoins paxos standard binancelabeled busd 96 reserves held cash cash equivalents 4 invested us treasury bills june 30 altcoin roundup xrp rallies xrp cryptocurrency used ripple payments network rallied fiveweek high wednesday company said said targeting 18 billion filipino remittance market cryptocurrency changed hands 074 european hours hitting highest level since june 21 representing 13 gain day according coindesk 20 data ether trading volume surges ether trading volume totaled 14 trillion januarytojune period 1461 rise 92 billion billion observed first half last year burger king brazil accepts dogecoin burger king brazil accepts dogecoin doge 264 payment method purchase fastfood chain dogpper dog snack service available since monday according company official website though users check availability delivery region company company said dogpper dog treat plays name burger king bestknown menu item whopper costs 3 doge company recommends purchasing maximum five units per order availability reasons relevant news luxor technologies launches index crypto mining stocks mining difficulty expected increase first time since since china crackdown ubs mulls offering prime brokerage services crypto etps european hedge funds sources grayscale bitcoin trust discount narrows unlocks pass ftx renames blockfolio trading app ftx robinhood investigation finra registration violation markets digital assets coindesk 20 ended higher higher wednesday notable winners 2100 utc 400 pm et notable losers yearn finance yfi 023 yearn finance yfi 023 strict set editorial policies coindesk independent operating subsidiary digital currency group invests cryptocurrencies blockchain 
startups'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going up': 8, 'positive': 4, 'negative': 3, 'going down': 1}, 'max_scores': {'going up': 0.9944525361061096, 'positive': 0.8424510955810547, 'negative': 0.7220838665962219, 'going down': 0.5190926194190979}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': \"Bitcoin is at a 'do-or-die' moment and could surge if it can hold above $40,000, analysts say - Markets Insider\", 'description': \"Bitcoin is at a 'do-or-die' moment and could surge if it can hold above $40,000, analysts say  Markets Insider\", 'published date': 'Thu, 29 Jul 2021 10:24:13 GMT', 'url': 'https://markets.businessinsider.com/news/currencies/bitcoin-price-outlook-btc-rally-40000-level-crypto-elon-musk-2021-7', 'publisher': {'href': 'https://markets.businessinsider.com', 'title': 'Markets Insider'}, 'extracted_data': {'title': \"Bitcoin is at a 'do-or-die' moment and could surge if it holds above $40,000, analysts say\", 'author': 'Harry Robertson', 'hostname': 'businessinsider.com', 'date': '2021-07-29', 'categories': '', 'tags': '', 'fingerprint': '5KmRiFmCkYpMa+BZs3PF/o1sB+g=', 'id': 'dd6031b3', 'source': 'https://markets.businessinsider.com/news/currencies/bitcoin-price-outlook-btc-rally-40000-level-crypto-elon-musk-2021-7', 'source-hostname': 'markets.businessinsider.com', 'excerpt': 'Bitcoin has climbed sharply in recent days. 
REUTERS/Dado Ruvic Bitcoin could surge if it can hold on to its recent gains and consolidate above...'}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin doordie moment could surge hold 40000 analysts say markets insider bitcoin doordie moment could surge holds 40000 analysts say bitcoin could surge hold recent gains consolidate 40000 analysts said kraken analysts said token facing doordie moment could hit new alltime high yet many risks risks horizon particularly threat regulation around world bitcoin facing doordie moment could move sharply higher hold stellar weekly gains taken roughly 40000 crypto analysts said bitcoin facing doordie moment could move sharply higher hold stellar weekly gains taken roughly 40000 crypto analysts analysts said world biggest cryptocurrency risen around 17 week far 40403 thursday according bloomberg data positive comments elon musk rumours amazon could accept crypto payments elon musk rumours amazon could accept crypto payments analysts kraken crypto exchange said past week explosive explosive marketwide shift sentiment seems converted bears bulls said bitcoin need consolidate key psychological level 40000 investors feel confident buying token pushing price higher right doordie bulls said note given btc struggles cracking 40000 42000 resistance past bulls however need turn 40000 40000 support look breakout several months rangebound trading 30000 42000 analysts added odds btc scoring new alltime high yearend improved read veteran columnist breaks 3 things every crypto newbie know trading 3 common mistakes avoid veteran columnist breaks 3 things every crypto newbie know know trading 3 common mistakes avoid alexandra clark sales trader digital asset broker globalblock said rebound 1 billion worth short crypto positions liquidated bitcoin dominance inches closer 50 bitcoin accounted around 48 total crypto market thursday according coinmarketcap jpmorgan crypto expert expert nikolaos panigirtzoglou said bitcoin dominance rising 
50 would signal momentum building coinmarketcap jpmorgan crypto expert nikolaos panigirtzoglou said bitcoin dominance rising 50 would signal momentum building analysts said another bullish signal fact amount bitcoin exchanges fell sharply sharply wednesday suggesting big buyers moving crypto storage yet major risks outlook bitcoin key one threat tougher regulations senator elizabeth warren week pushing tougher rules protect consumers wildly volatile market pushing tougher rules protect consumers wildly volatile market wrote treasury treasury secretary janet yellen say us financial stability oversight council must act quickly use statutory authority address cryptocurrencies risks regulate market ensure safety stability consumers financial system business insider'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going down': 3, 'negative': 4, 'going up': 8, 'positive': 7}, 'max_scores': {'going down': 0.9420739412307739, 'negative': 0.8024093508720398, 'going up': 0.9577222466468811, 'positive': 0.7244963645935059}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Bitcoin hash rate rebounds as major miners are coming back online - Cointelegraph', 'description': 'Bitcoin hash rate rebounds as major miners are coming back online  Cointelegraph', 'published date': 'Thu, 29 Jul 2021 10:55:13 GMT', 'url': 'https://cointelegraph.com/news/bitcoin-hash-rate-rebounds-as-major-miners-are-coming-back-online', 'publisher': {'href': 'https://cointelegraph.com', 'title': 'Cointelegraph'}, 'extracted_data': {'title': 'Bitcoin hash rate rebounds as major miners are coming back online', 'author': 'Arijit Sarkar', 'hostname': 'cointelegraph.com', 'date': '2021-07-29', 'categories': 'Latest News', 'tags': '#Bitcoin;#China;#Bitcoin Regulation;#Bitcoin Mining;#Regulation;#Hash Rate', 'fingerprint': 'UlZKhwOEas+O9knXhORLHUGtReU=', 'id': 'ae806a36', 'source': 'https://cointelegraph.com/news/bitcoin-hash-rate-rebounds-as-major-miners-are-coming-back-online', 'source-hostname': 'Cointelegraph', 'excerpt': 'Restrictions in China have forced homegrown Bitcoin miners and miners to move out to crypto-friendly nations.'}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin hash rate rebounds major miners coming back online cointelegraph china stringent crypto regulations meant closing shop many chinese businesses within bitcoin btc mining ecosystem sudden disappearance bitcoin miners grid resulted falling hash rates hashing performance cumulative computing computing power bitcoin network dropped alltime high 180 exahashes per second ehs 84 ehs 21 days btc mining ecosystem sudden disappearance bitcoin miners grid resulted falling hash rates hashing performance cumulative computing power bitcoin network 
dropped alltime high 180 exahashes per second ehs ehs 84 ehs 21 days hash rate drop directly attributable drop number chinese miners blockchaincom explorer data suggests steady increase mining difficulty since june 3 suggests steady increase mining difficulty since june 3 since drop hash rate increased 2138 owing return migrating chinese miners miners started operating regions resulting adjustment bitcoin mining difficulty translates higher computational costs formerly chinabased miners come back online operational costs bitcoin miners worldwide continue increase given initial resistance chinese government miners lookout countries offers offers regulatory clarity lower electricity costs initial resistance chinese government miners lookout countries offers regulatory clarity lower electricity costs related mine bitcoin everything need know mine bitcoin everything need know pretext shielding citizens highrisk investments chinese chinese authorities forced crypto businesses highly limit crypto portfolio offerings move offshore reported cointelegraph earlier month wang juana member china oecd blockchain expert policy advisory board stated reported cointelegraph earlier month wang juana member china oecd blockchain expert expert policy advisory board stated seeing cryptocurrency market enter path dechinaisation first trading computing power based series stronger steps taken cryptocurrencies bitcoin mining last week beijing peak september 2019 china contributed 7553 global bitcoin hash rate shown steady decline way way mining ban imposed china current hash rate contribution stands 4604 united states expanded share 1685 globally contributed 7553 global bitcoin hash rate shown steady decline way mining ban imposed china current hash rate contribution stands 4604 united states expanded share 1685 globally globally cointelegraph also covered instances jurisdictions including russia kazakhstan canada seen greater involvement crypto offering home migrating chinese miners many 
experts agree china shattered monopoly mining industry signals positive move toward decentralization crypto ecosystem russia russia kazakhstan canada seen greater involvement crypto offering home migrating chinese miners many experts agree china shattered monopoly mining industry signals positive move toward decentralization crypto ecosystem'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'positive': 1, 'going up': 1, 'negative': 1}, 'max_scores': {'positive': 0.6331064701080322, 'going up': 0.49385496973991394, 'negative': 0.42282581329345703}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Corporate crypto 101: How companies are using Bitcoin and other digital currency - Fortune', 'description': 'Corporate crypto 101: How companies are using Bitcoin and other digital currency  Fortune', 'published date': 'Thu, 29 Jul 2021 09:18:00 GMT', 'url': 'https://fortune.com/2021/07/29/companies-using-bitcoin-btc-crypto-101/', 'publisher': {'href': 'https://fortune.com', 'title': 'Fortune'}, 'extracted_data': None}, 'source_name': 'GoogleNews', 'processed_text': 'corporate crypto companies using bitcoin digital currency fortune'}\n",
            "{'segmented_data': {'aggregator_data': {'category_count': {'going down': 3, 'negative': 4, 'going up': 7, 'positive': 5}, 'max_scores': {'going down': 0.991021990776062, 'negative': 0.6630723476409912, 'going up': 0.9641163349151611, 'positive': 0.774677038192749}, 'aggregator_name': 'ClassificationMaxCategories'}}, 'meta': {'title': 'Why Bitcoin, Ethereum, and Other Cryptocurrencies Dropped Today - Motley Fool', 'description': 'Why Bitcoin, Ethereum, and Other Cryptocurrencies Dropped Today  Motley Fool', 'published date': 'Thu, 29 Jul 2021 14:23:00 GMT', 'url': 'https://www.fool.com/investing/2021/07/29/why-bitcoin-ethereum-and-other-cryptocurrencies-dr/', 'publisher': {'href': 'https://www.fool.com', 'title': 'Motley Fool'}, 'extracted_data': {'title': 'Why Bitcoin, Ethereum, and Other Cryptocurrencies Dropped Today | The Motley Fool', 'author': 'Howard Smith', 'hostname': 'fool.com', 'date': '2021-07-29', 'categories': 'investing', 'tags': 'cryptocurrency', 'fingerprint': 'wF6P1vrN7NT9W2UoH2NTw9FaT6k=', 'id': 'e6eb6e49', 'source': 'https://www.fool.com/investing/2021/07/29/why-bitcoin-ethereum-and-other-cryptocurrencies-dr/', 'source-hostname': 'The Motley Fool', 'excerpt': 'A new cryptocurrency tax may help fund the infrastructure bill.'}}, 'source_name': 'GoogleNews', 'processed_text': 'bitcoin ethereum cryptocurrencies dropped today motley fool happened first thought ongoing discussions among politicians regarding large infrastructure bill nt anything bitcoin crypto btc cryptocurrency us senate voted yesterday advance formal negotiations approximately 1 trillion infrastructure infrastructure package cryptocurrencies became part discussion vote one reasons bitcoin ethereum crypto eth dogecoin crypto doge ripple crypto xrp lower today cryptocurrencies opened 2 5 945 edt bitcoin ethereum dogecoin less 1 lower ripple still 35 crypto btc cryptocurrency us senate voted voted yesterday advance formal negotiations approximately 1 trillion infrastructure 
package cryptocurrencies became part discussion vote one reasons bitcoin ethereum crypto eth dogecoin crypto doge ripple crypto xrp lower today cryptocurrencies opened 2 5 945 edt bitcoin ethereum dogecoin less 1 1 lower ripple still 35 last night senate voted 6732 advance bipartisan infrastructure bill vote included 17 republicans agreeing move legislation forward addition news related new tax retail investors eyeing today initial public offering ipo retail brokerage robinhood nasdaq hood allows trading trading bitcoin ethereum dogecoin cryptocurrencies initial public offering ipo retail brokerage robinhood nasdaq hood allows trading bitcoin ethereum dogecoin cryptocurrencies deal moving infrastructure package forward includes imposing stricter rules cryptocurrency investors would collect new taxes taxes raising 28 billion according bloomberg money would used partially fund 550 billion spending transportation power grid specifically new rules would require businesses report cryptocurrency transactions valued 10000 would also require brokers report transactions involving digital assets internal internal revenue service cryptocurrency transactions valued 10000 would also require brokers report transactions involving digital assets internal revenue service robinhood begin trading publicly today cryptocurrency broker caters retail investors demographic also active trading cryptocurrencies cryptocurrencies today ipo raise robinhood public profile cryptocurrencies also reacting potential new tax rules result active trading currencies short term including today downward moves'}\n"
          ],
          "name": "stdout"
        }
      ]
    }
  ]
}
